Meta-programming Applied to Automatic SMP Parallelization of Linear Algebra Code

Authors

  • Joël Falcou
  • Jocelyn Sérot
  • Lucien Pech
  • Jean-Thierry Lapresté
Abstract

We describe a software solution to the problem of automatic parallelization of linear algebra code on multi-processor and multi-core architectures. This solution relies on the definition of a domain-specific language for matrix computations, a performance model for multiprocessor architectures, and its implementation using C++ template metaprogramming. Experimental results assess this model and its implementation on sample computation kernels.
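As a rough illustration of the kind of technique the abstract refers to, the sketch below uses C++ expression templates: operator overloading assembles a compile-time representation of a whole matrix expression, which is then evaluated in a single fused loop whose iterations can be distributed over SMP cores (here with OpenMP). The names matrix, add_expr and evaluate, and the use of OpenMP, are assumptions made for this example; it is not the authors' library, DSL, or performance model.

    #include <cstddef>
    #include <iostream>
    #include <type_traits>
    #include <vector>

    // Dense row-major matrix; the terminal node of every expression tree.
    struct matrix {
        std::size_t nr, nc;
        std::vector<double> data;
        matrix(std::size_t r, std::size_t c, double v = 0.0) : nr(r), nc(c), data(r * c, v) {}
        std::size_t rows() const { return nr; }
        std::size_t cols() const { return nc; }
        double  operator()(std::size_t i, std::size_t j) const { return data[i * nc + j]; }
        double& operator()(std::size_t i, std::size_t j)       { return data[i * nc + j]; }
    };

    // Compile-time node for an element-wise sum; nothing is computed here.
    template <typename L, typename R>
    struct add_expr {
        const L& lhs;
        const R& rhs;
        std::size_t rows() const { return lhs.rows(); }
        std::size_t cols() const { return lhs.cols(); }
        double operator()(std::size_t i, std::size_t j) const { return lhs(i, j) + rhs(i, j); }
    };

    // Restrict operator+ to matrix expressions so it builds a tree instead of computing.
    template <typename T>             struct is_expr                 : std::false_type {};
    template <>                       struct is_expr<matrix>         : std::true_type {};
    template <typename L, typename R> struct is_expr<add_expr<L, R>> : std::true_type {};

    template <typename L, typename R,
              typename = typename std::enable_if<is_expr<L>::value && is_expr<R>::value>::type>
    add_expr<L, R> operator+(const L& a, const R& b) { return {a, b}; }

    // Evaluate the whole tree in one fused loop; the outer rows are split across cores.
    template <typename E>
    matrix evaluate(const E& e) {
        matrix out(e.rows(), e.cols());
        #pragma omp parallel for
        for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(e.rows()); ++i)
            for (std::size_t j = 0; j < e.cols(); ++j)
                out(static_cast<std::size_t>(i), j) = e(static_cast<std::size_t>(i), j);
        return out;
    }

    int main() {
        matrix a(512, 512, 1.0), b(512, 512, 2.0), c(512, 512, 3.0);
        // a + b + c is assembled as a type at compile time and evaluated once;
        // the expression must be consumed in the same statement it is built.
        matrix r = evaluate(a + b + c);
        std::cout << r(0, 0) << "\n";   // prints 6
    }

Compiled with, for example, g++ -std=c++11 -fopenmp, the expression a + b + c produces no intermediate matrices; the whole sum is computed in one parallel pass.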


Similar resources

OpenTS: An Outline of Dynamic Parallelization Approach

The paper is dedicated to an open T-system (OpenTS), a programming system that supports automatic parallelization of computations for high-performance and distributed applications. In this paper, we describe the system architecture and input programming language as well as the system’s distinctive features. The paper focuses on the achievements of the last two years of development, including suppo...


Experiments with Cholesky Factorization on Clusters of SMPs

Cholesky factorization of large dense matrices is an integral part of many applications in science and engineering. In this paper we report on experiments with different parallel versions of Cholesky factorization on modern high-performance computing architectures. For the parallelization of Cholesky factorization we utilized various standard linear algebra software packages and present perform...
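For context on what such experiments parallelize, a minimal unblocked Cholesky factorization (lower-triangular variant) is sketched below in C++. It is illustrative only and does not correspond to the library implementations benchmarked in that paper; the trailing-matrix update in the inner loops is the O(n³) part that blocked and parallel versions distribute.

    #include <cmath>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    // In-place unblocked Cholesky factorization A = L * L^T of a row-major,
    // n x n symmetric positive-definite matrix; the lower triangle of `a`
    // is overwritten with L. Illustrative sketch only.
    void cholesky(std::vector<double>& a, std::size_t n) {
        for (std::size_t k = 0; k < n; ++k) {
            a[k * n + k] = std::sqrt(a[k * n + k]);              // pivot
            for (std::size_t i = k + 1; i < n; ++i)              // scale column k
                a[i * n + k] /= a[k * n + k];
            for (std::size_t j = k + 1; j < n; ++j)              // trailing update,
                for (std::size_t i = j; i < n; ++i)              // the O(n^3) kernel that
                    a[i * n + j] -= a[i * n + k] * a[j * n + k]; // parallel versions split up
        }
    }

    int main() {
        // 2 x 2 example: A = [[4, 2], [2, 3]]  =>  L = [[2, 0], [1, sqrt(2)]]
        std::vector<double> a = {4, 2, 2, 3};
        cholesky(a, 2);
        std::cout << a[0] << " " << a[2] << " " << a[3] << "\n";  // 2 1 1.41421
    }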


Effective Automatic Parallelization and Locality Optimization Using The Polyhedral Model

Multicore processors have now become mainstream. The difficulty of programming these architectures to effectively tap the potential of multiple processing units is well-known. Among several ways of addressing this issue, one of the very promising and simultaneously hard approaches is automatic parallelization. This approach does not require any effort on the part of the programmer in the process of ...


Final Report: Compiler-Driven Performance Optimization and Tuning for Multicore Architectures

The widespread emergence of multicore processors as the computing engine in all commodity platforms presents our field with an enormous software development crisis. For over two decades, sequential software applications have enjoyed the free ride of performance improvement with each new pr...


A novel compiler support for automatic parallelization on multicore systems

The widespread use of multicore processors is not a consequence of significant advances in parallel programming. In contrast, multicore processors arise due to the complexity of building power-efficient, high-clock-rate, single-core chips. Automatic parallelization of sequential applications is the ideal solution for making parallel programming as easy as writing programs for sequential compute...



Journal title:

Volume / Issue:

Pages:

Publication date: 2008